A Semantic approach for Text Clustering using WordNet based on Multi-Objective Genetic Algorithms

Authors

  • Jung Song Lee
  • Han Hee Hahm
  • Jong Joo Lee
  • Soon Cheol Park
Abstract

In this paper, we propose applying two Multi-Objective Genetic Algorithms (MOGAs), NSGA-II and SPEA2, to document clustering with semantic similarity measures based on WordNet. MOGAs have shown high performance compared with other clustering algorithms. The main problem in applying MOGAs to document clustering in the Vector Space Model (VSM) is that the VSM ignores relationships between important terms or words. The hierarchical structure of WordNet, a thesaurus-based ontology, provides an effective basis for measuring semantic similarity. We tested these algorithms on data sets from the Reuters-21578 collection and compared them with Genetic Algorithms (GA) using the same WordNet-based semantic similarity measures. We used the F-measure to evaluate the performance of these clustering algorithms. The experimental results show that the MOGAs based on WordNet outperform the other clustering algorithms under the same similarity measures.

Keywords: Document Clustering, Multi-Objective Genetic Algorithm, Semantic Similarity Measure, WordNet
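
The abstract names the F-measure as the evaluation metric but does not say which variant was used. The sketch below assumes one common definition for flat clusterings: for each true class, take the best F-score over all clusters, then average these weighted by class size. The function name `clustering_f_measure` and its arguments are illustrative and do not come from the paper.

```python
from collections import Counter


def clustering_f_measure(true_labels, cluster_ids):
    """Class-weighted best-match F-measure for a flat clustering.

    Assumption: this is one widely used variant, not necessarily the
    exact formula used by the authors.
    """
    if len(true_labels) != len(cluster_ids):
        raise ValueError("true_labels and cluster_ids must have equal length")
    n = len(true_labels)
    if n == 0:
        raise ValueError("at least one document is required")

    class_sizes = Counter(true_labels)      # n_i: documents per true class
    cluster_sizes = Counter(cluster_ids)    # n_j: documents per cluster
    overlap = Counter(zip(true_labels, cluster_ids))  # n_ij: shared documents

    total = 0.0
    for cls, n_i in class_sizes.items():
        best = 0.0
        for clu, n_j in cluster_sizes.items():
            n_ij = overlap[(cls, clu)]
            if n_ij == 0:
                continue
            precision = n_ij / n_j
            recall = n_ij / n_i
            f_score = 2 * precision * recall / (precision + recall)
            best = max(best, f_score)
        # Weight each class by its share of the collection.
        total += (n_i / n) * best
    return total


if __name__ == "__main__":
    labels = ["earn", "earn", "grain", "grain"]
    print(clustering_f_measure(labels, [0, 0, 1, 1]))  # clusters match classes
    print(clustering_f_measure(labels, [0, 0, 0, 0]))  # everything in one cluster
```

A score of 1.0 means every class is recovered exactly by some cluster; merging all documents into one cluster keeps recall at 1 but lowers precision, so the score drops.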

Similar resources

A Multi-Objective Approach to Fuzzy Clustering using ITLBO Algorithm

Data clustering is one of the most important areas of research in data mining and knowledge discovery. Recent research in this area has shown that the best clustering results can be achieved using multi-objective methods. In other words, assuming more than one criterion as objective functions for clustering data can measurably increase the quality of clustering. In this study, a model with two ...

A Joint Semantic Vector Representation Model for Text Clustering and Classification

Text clustering and classification are two main tasks of text mining. Feature selection plays a key role in the quality of the clustering and classification results. Although word-based features such as term frequency-inverse document frequency (TF-IDF) vectors have been widely used in different applications, their shortcoming in capturing semantic concepts of text motivated researchers to use...

Generating Optimal Timetabling for Lecturers using Hybrid Fuzzy and Clustering Algorithms

UCTTP is an NP-hard problem that must be solved anew every semester. The major technique in the presented approach would be analyzing data to resolve uncertainties of lecturers’ preferences and constraints within a department in order to obtain a ranking for each lecturer based on their requirements within a department where it is attempted to increase their satisfaction and develo...

Using Metaheuristic Algorithms Combined with Clustering Approach to Solve a Sustainable Waste Collection Problem

Sustainability is a monumental issue that should be considered in designing a logistics system. In order to incorporate sustainability concepts in our study, a waste collection problem with economic, environmental, and social objective functions was addressed. The first objective function minimized overall costs of the system, including establishment of depots and treatment facilities. Addressi...

AERO-THERMODYNAMIC OPTIMIZATION OF TURBOPROP ENGINES USING MULTI-OBJECTIVE GENETIC ALGORITHMS

In this paper multi-objective genetic algorithms were employed for Pareto approach optimization of turboprop engines. The considered objective functions are used to maximize the specific thrust, propulsive efficiency, thermal efficiency, propeller efficiency and minimize the thrust specific fuel consumption. These objectives are usually conflicting with each other. The design variables consist ...


Publication date: 2015